Resolving Ambiguities in Toponym Recognition in Cartographic Maps
Internal identifier: 001491 (Main/Exploration); previous: 001490; next: 001492
Authors: Alexander Gelbukh [South Korea, Mexico]; Serguei Levachkine [Mexico]; Sang-Yong Han [South Korea]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2004.
French descriptors
- Pascal (Inist)
- Ambiguité, Cartographie, Document graphique, Donnée textuelle, Graphisme, Méthode heuristique, Programmation automatique, Reconnaissance automatique, Reconnaissance caractère, Reconnaissance forme, Reconnaissance graphique, Reconnaissance optique caractère, Représentation graphique, Source information, Texte, Traitement image, Trame, Utilisation information, Vectorisation.
- Wicri :
- topic : Cartographie.
English descriptors
- KwdEn :
- Ambiguity, Automatic programming, Automatic recognition, Cartography, Character recognition, Graphic document, Graphical recognition, Graphics, Graphism, Heuristic method, Image processing, Information source, Information use, Optical character recognition, Pattern recognition, Raster, Text, Textual data, Vectorization.
Abstract
To date, many methods and programs for automatic text recognition exist. However, there are no effective text recognition systems for graphic documents. Graphic documents usually contain a great variety of textual information. As a rule, the text appears in arbitrary spatial positions, in different fonts, sizes, and colors. The text can touch and overlap graphic symbols. The meaning of the text is semantically much more ambiguous than that of standard text. To recognize the text of a graphic document, it is necessary first to separate it from linear objects, solids, and symbols and to determine its orientation. Even so, recognition programs nearly always produce errors. In the context of raster-to-vector conversion of graphic documents, the problem of text recognition is of special interest, because textual information can be used to verify vectorization results (post-processing). In this work, we propose a method that combines OCR-based text recognition in raster-scanned maps with heuristics specially adapted for cartographic data to resolve recognition ambiguities using, among other information sources, the spatial relationships between objects. Our goal is to form, in the vector thematic layers, geographically meaningful words correctly attached to the cartographic objects.
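The idea of combining OCR output with cartographic knowledge can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not detail. It only assumes that each OCR reading is scored against a list of known place names both by string similarity and by how close the label lies to the named object. The gazetteer entries, their coordinates, the `scale` and `weight` parameters, and all function names are hypothetical, chosen for illustration.

```python
from difflib import SequenceMatcher
from math import hypot

# Hypothetical toy gazetteer: place name -> (x, y) position of the map object.
# The coordinates are invented illustration values, not real map data.
GAZETTEER = {
    "Acapulco": (120.0, 340.0),
    "Acatlan": (410.0, 95.0),
    "Apizaco": (300.0, 210.0),
}

def string_score(ocr_text, name):
    """Similarity in [0, 1] between the OCR reading and a gazetteer name."""
    return SequenceMatcher(None, ocr_text.lower(), name.lower()).ratio()

def spatial_score(label_xy, object_xy, scale=100.0):
    """Score in (0, 1] that decays as the label lies farther from the object."""
    distance = hypot(label_xy[0] - object_xy[0], label_xy[1] - object_xy[1])
    return 1.0 / (1.0 + distance / scale)

def resolve_toponym(ocr_text, label_xy, gazetteer=GAZETTEER, weight=0.5):
    """Return the gazetteer name with the best weighted combined score."""
    def combined(name):
        return (weight * string_score(ocr_text, name)
                + (1.0 - weight) * spatial_score(label_xy, gazetteer[name]))
    return max(gazetteer, key=combined)
```

With this toy gazetteer, `resolve_toponym("Acap0lco", (125.0, 338.0))` selects "Acapulco": the misread character lowers the string score only slightly, and the label sits next to the matching object. A real system would need a proper gazetteer and carefully tuned weights.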
Url:
DOI: 10.1007/978-3-540-25977-0_7
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000058
- to stream Istex, to step Curation: 000057
- to stream Istex, to step Checkpoint: 000D26
- to stream Main, to step Merge: 001542
- to stream PascalFrancis, to step Corpus: 000508
- to stream PascalFrancis, to step Curation: 000281
- to stream PascalFrancis, to step Checkpoint: 000463
- to stream Main, to step Merge: 001674
- to stream Main, to step Curation: 001491
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Resolving Ambiguities in Toponym Recognition in Cartographic Maps</title>
<author><name sortKey="Gelbukh, Alexander" sort="Gelbukh, Alexander" uniqKey="Gelbukh A" first="Alexander" last="Gelbukh">Alexander Gelbukh</name>
</author>
<author><name sortKey="Levachkine, Serguei" sort="Levachkine, Serguei" uniqKey="Levachkine S" first="Serguei" last="Levachkine">Serguei Levachkine</name>
</author>
<author><name sortKey="Han, Sang Yong" sort="Han, Sang Yong" uniqKey="Han S" first="Sang-Yong" last="Han">Sang-Yong Han</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:D1AFFF8F50C2DA0B2C1F657E21ABB74DB5B39CBE</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-25977-0_7</idno>
<idno type="url">https://api.istex.fr/document/D1AFFF8F50C2DA0B2C1F657E21ABB74DB5B39CBE/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000058</idno>
<idno type="wicri:Area/Istex/Curation">000057</idno>
<idno type="wicri:Area/Istex/Checkpoint">000D26</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Gelbukh A:resolving:ambiguities:in</idno>
<idno type="wicri:Area/Main/Merge">001542</idno>
<idno type="wicri:source">INIST</idno>
<idno type="RBID">Pascal:04-0543582</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000508</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000281</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000463</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Gelbukh A:resolving:ambiguities:in</idno>
<idno type="wicri:Area/Main/Merge">001674</idno>
<idno type="wicri:Area/Main/Curation">001491</idno>
<idno type="wicri:Area/Main/Exploration">001491</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Resolving Ambiguities in Toponym Recognition in Cartographic Maps</title>
<author><name sortKey="Gelbukh, Alexander" sort="Gelbukh, Alexander" uniqKey="Gelbukh A" first="Alexander" last="Gelbukh">Alexander Gelbukh</name>
<affiliation><wicri:noCountry code="no comma">Natural Language Processing Lab</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1"><country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Computer Science and Engineering Department, Chung-Ang University</wicri:regionArea>
<wicri:noRegion>Chung-Ang University</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Mexique</country>
</affiliation>
</author>
<author><name sortKey="Levachkine, Serguei" sort="Levachkine, Serguei" uniqKey="Levachkine S" first="Serguei" last="Levachkine">Serguei Levachkine</name>
<affiliation><wicri:noCountry code="subField">(IPN)</wicri:noCountry>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Mexique</country>
</affiliation>
</author>
<author><name sortKey="Han, Sang Yong" sort="Han, Sang Yong" uniqKey="Han S" first="Sang-Yong" last="Han">Sang-Yong Han</name>
<affiliation wicri:level="1"><country xml:lang="fr">Corée du Sud</country>
<wicri:regionArea>Computer Science and Engineering Department, Chung-Ang University</wicri:regionArea>
<wicri:noRegion>Chung-Ang University</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Corée du Sud</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">D1AFFF8F50C2DA0B2C1F657E21ABB74DB5B39CBE</idno>
<idno type="DOI">10.1007/978-3-540-25977-0_7</idno>
<idno type="ChapterID">7</idno>
<idno type="ChapterID">Chap7</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Ambiguity</term>
<term>Automatic programming</term>
<term>Automatic recognition</term>
<term>Cartography</term>
<term>Character recognition</term>
<term>Graphic document</term>
<term>Graphical recognition</term>
<term>Graphics</term>
<term>Graphism</term>
<term>Heuristic method</term>
<term>Image processing</term>
<term>Information source</term>
<term>Information use</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Raster</term>
<term>Text</term>
<term>Textual data</term>
<term>Vectorization</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Ambiguité</term>
<term>Cartographie</term>
<term>Document graphique</term>
<term>Donnée textuelle</term>
<term>Graphisme</term>
<term>Méthode heuristique</term>
<term>Programmation automatique</term>
<term>Reconnaissance automatique</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance graphique</term>
<term>Reconnaissance optique caractère</term>
<term>Représentation graphique</term>
<term>Source information</term>
<term>Texte</term>
<term>Traitement image</term>
<term>Trame</term>
<term>Utilisation information</term>
<term>Vectorisation</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Cartographie</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: To date many methods and programs for automatic text recognition exist. However there are no effective text recognition systems for graphic documents. Graphic documents usually contain a great variety of textual information. As a rule the text appears in arbitrary spatial positions, in different fonts, sizes and colors. The text can touch and overlap graphic symbols. The text meaning is semantically much more ambiguous in comparison with standard text. To recognize a text of graphic documents, it is necessary first to separate it from linear objects, solids, and symbols and to define its orientation. Even so, the recognition programs nearly always produce errors. In the context of raster-to-vector conversion of graphic documents, the problem of text recognition is of special interest, because textual information can be used for verification of vectorization results (post-processing). In this work, we propose a method that combines OCR-based text recognition in raster-scanned maps with heuristics specially adapted for cartographic data to resolve the recognition ambiguities using, among other information sources, the spatial object relationships. Our goal is to form in the vector thematic layers geographically meaningful words correctly attached to the cartographic objects.</div>
</front>
</TEI>
<affiliations><list><country><li>Corée du Sud</li>
<li>Mexique</li>
</country>
</list>
<tree><country name="Corée du Sud"><noRegion><name sortKey="Gelbukh, Alexander" sort="Gelbukh, Alexander" uniqKey="Gelbukh A" first="Alexander" last="Gelbukh">Alexander Gelbukh</name>
</noRegion>
<name sortKey="Han, Sang Yong" sort="Han, Sang Yong" uniqKey="Han S" first="Sang-Yong" last="Han">Sang-Yong Han</name>
<name sortKey="Han, Sang Yong" sort="Han, Sang Yong" uniqKey="Han S" first="Sang-Yong" last="Han">Sang-Yong Han</name>
</country>
<country name="Mexique"><noRegion><name sortKey="Gelbukh, Alexander" sort="Gelbukh, Alexander" uniqKey="Gelbukh A" first="Alexander" last="Gelbukh">Alexander Gelbukh</name>
</noRegion>
<name sortKey="Levachkine, Serguei" sort="Levachkine, Serguei" uniqKey="Levachkine S" first="Serguei" last="Levachkine">Serguei Levachkine</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001491 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001491 | SxmlIndent | more
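Where Dilib is not available, a record saved from the commands above could also be read with a general-purpose XML parser. The following is a minimal sketch, not part of Dilib. It assumes the record text has the shape shown on this page, where the `wicri:` prefix is used without a namespace declaration, so the sketch adds a placeholder declaration before parsing. The namespace URI, the function names, and the trimmed `SAMPLE` excerpt are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder URI (an assumption): the record uses "wicri:" without declaring it,
# and a strict XML parser rejects undeclared prefixes.
WICRI_NS = "urn:example:wicri"

def parse_record(xml_text):
    """Parse a <record>, declaring the wicri prefix on the root element first."""
    patched = xml_text.replace("<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1)
    return ET.fromstring(patched)

def summarize(record):
    """Collect the title, author names, and English keywords of a record."""
    title = record.findtext(".//titleStmt/title")
    authors = [n.text for n in record.findall(".//titleStmt/author/name")]
    keywords = [t.text for t in record.findall(".//keywords[@scheme='KwdEn']/term")]
    return {"title": title, "authors": authors, "keywords": keywords}

# Trimmed excerpt of the record shown on this page, for demonstration only.
SAMPLE = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt>
<title xml:lang="en">Resolving Ambiguities in Toponym Recognition in Cartographic Maps</title>
<author><name>Alexander Gelbukh</name></author>
</titleStmt></fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Ambiguity</term>
<term>Cartography</term></keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""

if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

Patching the root element keeps the sketch within the standard library. Actual `HfdSelect` output may differ from this excerpt, so the paths in `summarize` may need adjusting.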
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:D1AFFF8F50C2DA0B2C1F657E21ABB74DB5B39CBE |texte= Resolving Ambiguities in Toponym Recognition in Cartographic Maps }}
This area was generated with Dilib version V0.6.32.